Human-Machine Cooperation in Large-Scale Multimedia Retrieval: A Survey

Authors

  • Kimiaki Shirahama
  • Marcin Grzegorzek
  • Bipin Indurkhya
Abstract

Large-Scale Multimedia Retrieval (LSMR) is the task of quickly analyzing a large amount of multimedia data, such as images or videos, and accurately finding the items relevant to a given semantic meaning. Although LSMR has been investigated for more than two decades in the fields of multimedia processing and computer vision, a more interdisciplinary approach is necessary to develop an LSMR system that is truly meaningful for humans. To this end, this paper aims to draw attention to the LSMR problem from diverse research fields. While explaining basic terminology in LSMR, we first survey several representative methods in chronological order. This reveals that, because they prioritize generality and scalability for large-scale data, recent methods interpret semantic meanings with a completely different mechanism from humans, although humanlike mechanisms were used in classical heuristic-based methods. Based on this, we discuss human-machine cooperation, which incorporates knowledge about human interpretation into LSMR without sacrificing generality and scalability. In particular, we present three approaches to human-machine cooperation (cognitive, ontological, and adaptive), which are attributed to cognitive science, ontology engineering, and metacognition, respectively. We hope that this paper will create a bridge that enables researchers in different fields to communicate about the LSMR problem and will lead to a ground-breaking next generation of LSMR systems.

Correspondence concerning this article should be addressed to Kimiaki Shirahama, Pattern Recognition Group, University of Siegen, Hoelderlinstrasse 3, 57076 Siegen, Germany, or via email to [email protected].

Similar resources

A Survey on Interactive Video Retrieval Using Active Learning Approach

Active learning is a machine learning technique that chooses the most informative samples for labelling and uses them as training data. It has been extensively explored in the multimedia research area as a way to reduce human annotation effort. In this article, active learning efforts in multimedia annotation and retrieval are surveyed. The application domains such as image or video annotation, ...

Large-Scale Multimedia Retrieval and Mining

Recent years have witnessed an explosive growth of multimedia data due to higher processor speeds, faster networks, wider availability of high-capacity mass-storage devices, and the advent of cloud computing. Stimulated by current work in scalable machine learning, feature indexing and multimodal analysis techniques, researchers are increasingly interested in exploring challenges and new oppor...

Large-Scale Multimedia Retrieval and Mining [Guest editors' introduction]

Rahul Sukthankar (Intel Labs and Carnegie Mellon University). Recent years have witnessed an explosive growth of multimedia data due to higher processor speeds, faster networks, wider availability of high-capacity mass-storage devices, and the advent of cloud computing. Stimulated by current work in scalable machine learning, feature indexing and multimodal analysis techniques, researchers are in...

A Survey of Content-Based Image Retrieval Systems using Scale-Invariant Feature Transform (SIFT)

Content-based image retrieval (CBIR) is a method for finding similar images in large image databases. As networks and multimedia technologies become more widespread, users are no longer satisfied with traditional information retrieval techniques. In recent years, local descriptors have been used as image features to improve the performance of CBIR. SIFT is one of the most loc...

Exploiting Modern Hardware for High-Dimensional Nearest Neighbor Search. (Exploitation du matériel moderne pour la recherche de plus proche voisin en haute dimensionnalité)

Many multimedia information retrieval or machine learning problems require efficient high-dimensional nearest neighbor search techniques. For instance, multimedia objects (images, music or videos) can be represented by high-dimensional feature vectors. Finding two similar multimedia objects then comes down to finding two objects that have similar feature vectors. In the current context of mass ...

Publication date: 2016